Subjective Annotation for a Frame Interpolation Benchmark using Artefact Amplification
Current benchmarks for optical flow algorithms evaluate the estimation either
directly by comparing the predicted flow fields with the ground truth or
indirectly by using the predicted flow fields for frame interpolation and then
comparing the interpolated frames with the actual frames. In the latter case,
objective quality measures such as the mean squared error are typically
employed. However, it is well known that for image quality assessment, the
actual quality experienced by the user cannot be fully deduced from such simple
measures. Hence, we conducted a subjective quality assessment crowdsourcing
study for the interpolated frames provided by one of the optical flow
benchmarks, the Middlebury benchmark. We collected forced-choice paired
comparisons between interpolated images and corresponding ground truth. To
increase the sensitivity of observers when judging minute differences in paired
comparisons, we introduced a new method to the field of full-reference quality
assessment, called artefact amplification. From the crowdsourcing data, we
reconstructed absolute quality scale values according to Thurstone's model. As
a result, we obtained a re-ranking of the 155 participating algorithms w.r.t.
the visual quality of the interpolated frames. This re-ranking not only shows
the necessity of visual quality assessment as another evaluation metric for
optical flow and frame interpolation benchmarks, but the results also provide the
ground truth for designing novel image quality assessment (IQA) methods
dedicated to perceptual quality of interpolated images. As a first step, we
proposed such a new full-reference method, called WAE-IQA. By weighting the
local differences between an interpolated image and its ground truth, WAE-IQA
performed slightly better than the currently best FR-IQA approach from the
literature.
Comment: arXiv admin note: text overlap with arXiv:1901.0536
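The step of reconstructing scale values from paired comparisons can be illustrated with a minimal sketch of Thurstone's Case V model. This is a generic textbook version, not the authors' reconstruction code; the function name `thurstone_case_v`, the `wins` matrix layout, and the smoothing constant are assumptions made for illustration.

```python
from statistics import NormalDist

def thurstone_case_v(wins):
    """Estimate Case V scale values from a pairwise win-count matrix.

    wins[i][j] is the number of times item i was preferred over item j
    (hypothetical layout, chosen for this sketch).
    """
    n = len(wins)
    std_normal = NormalDist()
    scores = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            # Smoothing (an assumption) keeps the proportion strictly
            # inside (0, 1) so the inverse normal CDF stays finite.
            p = (wins[i][j] + 0.5) / (total + 1.0)
            z_sum += std_normal.inv_cdf(p)
        # Least-squares Case V solution: mean z-score over all items,
        # counting the self-comparison as z = 0.
        scores.append(z_sum / n)
    return scores

# Example: item 0 is preferred most often, item 2 least often.
example_wins = [[0, 8, 9],
                [2, 0, 7],
                [1, 3, 0]]
example_scores = thurstone_case_v(example_wins)
```

The resulting values are on an interval scale centred at zero, so only differences between scores are meaningful.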
Recovering Missing Coefficients in DCT-Transformed Images
A general method for recovering missing DCT coefficients in DCT-transformed
images is presented in this work. We model the DCT coefficients recovery
problem as an optimization problem and recover all missing DCT coefficients via
linear programming. The visual quality of the recovered image gradually
decreases as the number of missing DCT coefficients increases. For some images,
the quality is surprisingly good even when more than 10 of the most significant DCT
coefficients are missing. When only the DC coefficient is missing, the proposed
algorithm outperforms existing methods according to experimental results
conducted on 200 test images. The proposed recovery method can be used for
cryptanalysis of DCT-based selective encryption schemes and other applications.
Comment: 4 pages, 4 figures
An Improved DC Recovery Method from AC Coefficients of DCT-Transformed Images
Motivated by the work of Uehara et al. [1], an improved method to recover DC
coefficients from AC coefficients of DCT-transformed images is investigated in
this work, which finds applications in cryptanalysis of selective multimedia
encryption. The proposed under/over-flow rate minimization (FRM) method employs
an optimization process to get a statistically more accurate estimation of
unknown DC coefficients, thus achieving better recovery performance.
Experimental results on 200 test images show that the proposed DC
recovery method significantly improves the quality of most recovered images in
terms of PSNR and several state-of-the-art objective image quality
assessment (IQA) metrics such as SSIM and MS-SSIM.
Comment: 6 pages, 6 figures, ICIP 201
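The under/over-flow idea can be illustrated with a toy sketch: a missing DC coefficient shifts all pixels of a block by a constant, so candidate shifts can be scored by how many pixels they push outside the valid range. This shows only that intuition and is not the authors' FRM algorithm; the function names, the single-block setting, and the 8-bit range are assumptions.

```python
def overflow_rate(pixels, offset, lo=0, hi=255):
    """Fraction of pixels that leave [lo, hi] after adding a constant offset."""
    out_of_range = sum(1 for p in pixels if p + offset < lo or p + offset > hi)
    return out_of_range / len(pixels)

def minimizing_offsets(ac_only_pixels, lo=0, hi=255):
    """Return all integer offsets in [lo, hi] with the lowest overflow rate.

    ac_only_pixels: pixel values reconstructed with the DC term set to zero
    (hypothetical input for this sketch).
    """
    rates = {c: overflow_rate(ac_only_pixels, c, lo, hi) for c in range(lo, hi + 1)}
    best = min(rates.values())
    return [c for c, r in rates.items() if r == best]

# Example: a zero-mean block spanning -100..100 admits offsets 100..155.
feasible = minimizing_offsets([-100, -50, 0, 50, 100])
```

A block with a wide dynamic range leaves few feasible offsets and therefore constrains the unknown DC value more tightly than a flat block does.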
DeepFL-IQA: Weak Supervision for Deep IQA Feature Learning
Multi-level deep features have been driving state-of-the-art methods for
aesthetics and image quality assessment (IQA). However, most IQA benchmarks
consist of artificially distorted images, for which features derived from
ImageNet under-perform. We propose a new IQA dataset and a weakly supervised
feature learning approach to train features more suitable for IQA of
artificially distorted images. The dataset, KADIS-700k, is far more extensive
than similar works, consisting of 140,000 pristine images and 25 distortion
types, for a total of 700k distorted versions. Our weakly supervised feature learning
is designed as multi-task training, using eleven existing
full-reference IQA metrics as proxies for differential mean opinion scores. We
also introduce a benchmark database, KADID-10k, of artificially degraded
images, each subjectively annotated by 30 crowd workers. We make use of our
derived image feature vectors for (no-reference) image quality assessment by
training and testing a shallow regression network on this database and five
other benchmark IQA databases. Our method, termed DeepFL-IQA, performs better
than other feature-based no-reference IQA methods and also better than all
tested full-reference IQA methods on KADID-10k. For the other five benchmark
IQA databases, DeepFL-IQA matches the performance of the best existing
end-to-end deep learning-based methods on average.
Comment: dataset url: http://database.mmsp-kn.d
Critical analysis on the reproducibility of visual quality assessment using deep features
Data used to train supervised machine learning models are commonly split into
independent training, validation, and test sets. In this paper, we illustrate
that intricate cases of data leakage have occurred in the no-reference video
and image quality assessment literature. We show that the performance results
reported in several recently published journal papers, which are well above the best
performances in related works, cannot be reproduced. Our analysis shows that
information from the test set was inappropriately used in the training process
in different ways. When correcting for the data leakage, the performances of
the approaches drop below the state of the art by a large margin. Additionally,
we investigate end-to-end variations of the discussed approaches, which do not
improve upon the originals.
Comment: 20 pages, 7 figures, PLOS ONE journal. arXiv admin note: substantial
text overlap with arXiv:2005.0440
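One common safeguard against this kind of leakage is to split by source content, so that all distorted versions of the same original land in the same partition. The sketch below shows that general practice only; it is not a reconstruction of the specific leaks analysed in the paper, and `group_split` and its parameters are hypothetical names.

```python
import random

def group_split(items, group_of, test_fraction=0.2, seed=0):
    """Split items into train/test so that no group appears in both parts.

    group_of maps an item to its group key, e.g. the source image or video
    that a distorted sample was derived from (assumed for this sketch).
    """
    groups = sorted({group_of(x) for x in items})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(test_fraction * len(groups)))
    test_groups = set(groups[:n_test])
    train = [x for x in items if group_of(x) not in test_groups]
    test = [x for x in items if group_of(x) in test_groups]
    return train, test

# Example: 10 reference images, each with 5 distorted versions.
samples = [("ref%d" % r, d) for r in range(10) for d in range(5)]
train_set, test_set = group_split(samples, lambda s: s[0])
```

Any preprocessing or fine-tuning step fitted on data would then use `train_set` only, with `test_set` touched once for the final evaluation.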
SUR-Net: Predicting the Satisfied User Ratio Curve for Image Compression with Deep Learning
The Satisfied User Ratio (SUR) curve for a lossy image compression scheme, e.g.,
JPEG, characterizes the probability distribution of the Just Noticeable
Difference (JND) level, the smallest distortion level that can be perceived by a
subject. We propose the first deep learning approach to predict such SUR curves.
Instead of the direct approach of regressing the SUR curve itself for a given
reference image, our model is trained on pairs of images, original and
compressed. Relying on a Siamese Convolutional Neural Network (CNN), feature
pooling, a fully connected regression head, and transfer learning, we achieved
good prediction performance. Experiments on the MCL-JCI dataset showed a mean
Bhattacharyya distance between the predicted and the original JND distributions
of only 0.072
- …
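The two quantities named in this abstract can be sketched as follows: an empirical SUR curve computed from per-subject JND levels, and the Bhattacharyya distance between two discrete distributions. The integer distortion levels, the "JND strictly above the level" convention, and the discrete form of the distance are assumptions for illustration; the paper may use a different parameterisation.

```python
import math

def sur_curve(jnd_levels, max_level):
    """Empirical SUR: fraction of subjects whose JND lies above each level.

    jnd_levels: one JND level per subject (hypothetical integer scale).
    Returns a list indexed by distortion level 0..max_level.
    """
    n = len(jnd_levels)
    return [sum(1 for j in jnd_levels if j > d) / n for d in range(max_level + 1)]

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete probability distributions."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(coefficient)

# Example: four subjects with JND levels 2, 3, 3 and 5.
example_sur = sur_curve([2, 3, 3, 5], max_level=5)
example_distance = bhattacharyya_distance([0.25, 0.5, 0.25], [0.2, 0.5, 0.3])
```

The distance is zero for identical distributions and grows as their overlap shrinks, which is why a small mean value indicates close agreement between predicted and observed JND distributions.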